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Understanding the Flow of Content in Summarizing HTML Documents 

AFR Rahman, H Alam, R Hartono - Int. Workshop on Document Layout Interpretation and its ..., 2001 -
science.uva.nl

... The proposed system works in automatically summarizing live web content on the fly 
to ... This paper has presented a concept to summarize HTML documents based on ... 

Cited by 8 - View as HTML - Web Search

[ps] Domain-independent summarization of news 

LF Rau, R Brandow, K Mitze - Summarizing Text for Intelligent Communication, 1994 - transfer.ik.fh-hannover.de

... as opposed to hard-coding the special conditions that identify these documents. 
4 Conclusions Anes was an experiment in automatically summarizing news using a ... 

Cited by 22 - View as HTML - Web Search

Using automated classification for summarizing and selecting heterogeneous information 
sources - group of 6 » 

R Dolin, D Agrawal, AE Abbadi, J Pearlman - D-Lib Magazine, 1998 - webdoc.sub.gwdg.de
... and variability of sources increases, new ways of automatically summarizing, 
discovering, and ... irrespective of the structure of the actual data or documents. ... 

Cited by 20 - Cached - Web Search 

A method for automatically abstracting visual documents - group of 4 »
ME Rorvig - Journal of the American Society for Information Science, 1993 - doi.wiley.com
... This article describes a method for automatically selecting key frames ... steady reductions 
in cost, methods for summarizing these documents have remained ... 

Cited by 17 - Web Search - BL Direct

Content Extraction from HTML Documents 

AFR Rahman, H Alam, R Hartono - 1st Int. Workshop on Web Document Analysis (WDA2001), 2001 -
csc.liv.ac.uk

... With it, you can virtually produce PDF documents from any Windows application ... The 
proposed system works in automatically summarizing live web content on the fly ... 

Cited by 13 - View as HTML - Web Search

Tracking and summarizing news on a daily basis with Columbia's newsblaster - group of 6 »
KR McKeown, R Barzilay, D Evans, V ... - Proceedings of the Human Language Technology Conference, 2002 -
cs.columbia.edu

... 5. SUMMARIZING EVENTS All sets of clustered articles ... dependent on the type of documents 
in each ... summarization only [3], A router automatically determines the ... 

Cited by 35 - View as HTML - Web Search

Centroid-based summarization of multiple documents: sentence extraction, utility-based
evaluation, ... - group of 17 »

DR Radev, H Jing, M Budzikowska - ANLP/NAACL Workshop on Summarization, 2000 - acl.ldc.upenn.edu
... It summarizes clusters of news articles automatically grouped by a topic ... Vibhu Mittal, 
and Jaime Carbonell, Summarizing Text Documents: Sentence Selection ... 

Cited by 115 - View as HTML - Web Search

Ubiquitous speech processing - group of 7 »

S Furui, K Iwano, C Hori, T Shinozaki, Y Saito, S ... - ICASSP IEEE INT CONF ACOUST SPEECH SIGNAL
PROCESS PROC, 2001 - ieeexplore.ieee.org

... TRANSCRIBING, UNDERSTANDING AND SUMMARIZING UBIQUITOUS SPEECH DOCUMENTS 4.1 
Transcription ... have proposed a method of automatically summarizing speech, sentence ... 

Cited by 8 - Web Search - BL Direct

OCELOT: a system for summarizing Web pages - group of 3 »
AL Berger, VO Mittal - SIGIR, 2000 - portal.acm.org

OCELOT: A system for summarizing web pages ... retrieval literature [2] for automatically
discovering words ... a length distribution on documents, which presumably ... 

Cited by 59 - Web Search

Web Page Summarization Using Dynamic Content - group of 7 »
A Jatowt, M Ishizuka - Proceedings of the 13th International World Wide Web ... - portal.acm.org
... popularity of the Web, web documents should no ... which advocates extracting and 
summarizing changes from ... E., and Paris, C. Automatically summarizing web sites ...

Google Home - About Google - About Google Scholar 



©2006 Google 



http://scholar.google.com/scholar?hl=en&



3/21/06 



automatically summarizing documents modifying parts of speech - Google Scholar 






"of" is a very common word and was not included in your search. [details]

A Statistical Approach to Automatic Speech Summarization - group of 9 »

C Hori, S Furui, R Malkin, H Yu, A Waibel - EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2003 -
hindawi.co.uk

... The process of summarizing speech involves excluding recog ... a summarization score 
from an automatically transcribed sentence ... in all the training documents, and F ... 

Cited by 16 - View as HTML - Web Search - BL Direct

An Integrated Approach to Semantic Evaluation and Content-Based Retrieval of Multimedia 
Documents - group of 8 » 

A Knoll, C Altenschmidt, J Biskup, HM Bluethgen, I ... - LECTURE NOTES IN COMPUTER SCIENCE, 1998 -
Springer 

... in natural language to analyzing, summarizing and presentation ... descriptors are assigned 
automatically (on demand ... Content-Based Retrieval of Multimedia Documents ... 

Cited by 22 - Web Search - BL Direct

Towards Incorporating Scientific Literature into Biological Algorithms - group of 5 » 
D Copenhagen - jeffchang.com

... Andrade and Bork 2000), and summarizing the results ... Automatically annotates protein 
sequences in TrEMBL using ... protein and gene names in biomedical documents. ... 

View as HTML - Web Search

Introduction to EuroWordNet - group of 4 »

P Vossen - Computers and the Humanities, 1998 - Springer 

... it would still not automatically give us a good conceptual ... Summarizing, the modular 
multilingual design of the EWN-database ... The most important documents will be ... 

Cited by 67 - Web Search

[ps] Developing and Evaluating a Document Visualization System for Information Management
- group of 2 »

B Hui - 2002 - cs.toronto.edu 

... our system is not intended for summarizing documents, both systems ... them out onto 
the map can be done automatically by ... IE system is a set of documents to process ... 

Cited by 1 - View as HTML - Web Search - Library Search

Summarizing Encyclopedic Term Descriptions on the Web - group of 6 » 

A Fujii, T Ishikawa - Proceedings of the 20th International Conference on ..., 2004 - acl.ldc.upenn.edu

Summarizing Encyclopedic Term Descriptions on the Web ... lated using the total size
of the input documents. ... The goal of our research is to automatically compile a ...

Cited by 6 - View as HTML - Web Search

MiTAP for Biosecurity: A Case Study - group of 3 »

L Damianos, J Ponte, S Wohlever, F Reeder, D Day, ... - AI Magazine, 2002 - mitre.org
... This was a key element in successfully modifying the Alembic NLP system for ... Figure 
8 Newsblaster automatically summarizes clusters of documents. ... 

Cited by 9 - View as HTML - Web Search - BL Direct

[ps] A literature survey on information extraction and text summarization - group of 3 »
K Zechner - Term paper, Carnegie Mellon University, 1997 - www-2.cs.cmu.edu 
... ambivalent: Early ideas and systems of automatically condensing and/or summarizing 
documents date back ... a list of items (in particular: documents, or parts ... 

Cited by 12 - View as HTML - Web Search

Gist-it: Summarizing email using linguistic knowledge and machine learning - group of 8 »
E Tzoukermann, S Muresan, J Klavans - Proceeding of the HLT and KM Workshop, EACL/ACL 2001, 2001 -
acl.ldc.upenn.edu

... explore with genetic algorithms to automatically learn them ... OCELOT: A system for
summarizing web pages ... Salience-based content characterisation of text documents. ... 

Cited by 7 - View as HTML - Web Search

MiTAP, Text and Audio Processing for Bio-Security: A Case Study - group of 13 »

L Damianos, J Ponte, S Wohlever, F Reeder, D Day, ... - PROCEEDINGS OF THE NATIONAL CONFERENCE
ON ARTIFICIAL ..., 2002 - mitap.sdsu.edu

... its support for easily and effectively combining automatically derived heuristics ... 
Summarizing Similarities and Differences Among Related Documents. ... 

Cited by 3 - View as HTML - Web Search - BL Direct






Results (page 1): automatically summarizing documents 



Terms used: automatically summarizing documents

Search: The ACM Digital Library
Found 29,049 of 171,143
Results 1 - 20 of 200 (best 200 shown)



1 Technical papers: Towards topic-based summarization for interactive document
viewing

Achim Hoffmann, Son Bao Pham

October 2003 Proceedings of the 2nd international conference on Knowledge capture
K-CAP '03

Publisher: ACM Press

Full text available: pdf(120.31 KB) Additional Information: full citation, abstract, references, index terms

Our research aims at interactive document viewers that can select and highlight relevant 
text passages on demand. Another related objective is the generation of topic-specific 
summaries of texts as opposed to general purpose summaries. This paper introduces our 
notions of discourse structure tree and level-of-detail tree. Both structures are used to 
represent relevant aspects of a text segment for the above mentioned purposes. 
Furthermore, we introduce a Knowledge Acquisition Frame ... 



Keywords: knowledge acquisition, natural language processing 



2 Information fusion in the context of multi-document summarization

Regina Barzilay, Kathleen R. McKeown, Michael Elhadad

June 1999 Proceedings of the 37th annual meeting of the Association for 
Computational Linguistics on Computational Linguistics 

Publisher: Association for Computational Linguistics 

Full text available: pdf(807.95 KB) Additional Information: full citation, abstract, references, citings

We present a method to automatically generate a concise summary by identifying and 
synthesizing similar elements across related text from a set of multiple documents. Our 
approach is unique in its usage of language generation to reformulate the wording of the 
summary. 



3 OCELOT: a system for summarizing Web pages
Adam L Berger, Vibhu O. Mittal 

July 2000 Proceedings of the 23rd annual international ACM SIGIR conference on 
Research and development in information retrieval 

Publisher: ACM Press

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

We introduce OCELOT, a prototype system for automatically generating the "gist" of a 
web page by summarizing it. Although most text summarization research to date has 
focused on the task of news articles, web pages are quite different in both structure and 
content. Instead of coherent text with a well-defined discourse structure, they are more 
often likely to be a chaotic jumble of phrases, links, graphics and formatting commands. 
Such text provides little foothold for extractive ... 

4 Dissertation abstracts: Automatic summarization focusing on document genre and
text structure
Yohei Seki

June 2005 ACM SIGIR Forum, Volume 39 issue 1
Publisher: ACM Press

Full text available: pdf(154.94 KB) Additional Information: full citation, abstract, index terms

This dissertation proposes a new automatic summarization method focusing on document 
genre and text structure, and verifies its effectiveness. "Document genre" refers to the 
type of document, such as a diary or a report. "Text structure" refers to the functional 
aspects of the text and divides the text into sentence units or components, according to 
their functional roles. This type of structure includes both the components and their 
organization within the text of a ... 

5 Automatic text summarization based on the Global Document Annotation
Katashi Nagao, Koiti Hasida

August 1998 Proceedings of the 17th international conference on Computational
linguistics - Volume 2, Proceedings of the 36th annual meeting on
Association for Computational Linguistics - Volume 2

Publisher: Association for Computational Linguistics, Association for Computational Linguistics

Full text available: pdf(476.15 KB) Additional Information: full citation, abstract, references, citings

The GDA (Global Document Annotation) project proposes a tag set which allows machines 
to automatically infer the underlying semantic/pragmatic structure of documents. Its 
objectives are to promote development and spread of NLP/AI applications to render GDA- 
tagged documents versatile and intelligent contents, which should motivate WWW (World 
Wide Web) users to tag their documents as part of content authoring. This paper 
discusses automatic text summarization based on GDA. Its main features are a ... 

6 Summarization: Topic themes for multi-document summarization
Sanda Harabagiu, Finley Lacatusu

August 2005 Proceedings of the 28th annual international ACM SIGIR conference on
Research and development in information retrieval SIGIR '05

Publisher: ACM Press

Full text available: pdf(245.65 KB) Additional Information: full citation, abstract, references, index terms

The problem of using topic representations for multi-document summarization (MDS) has 
received considerable attention recently. In this paper, we describe five different topic 
representations and introduce a novel representation of topics based on topic themes. We 
present eight different methods of generating MDS and evaluate each of these methods 
on a large set of topics used in past DUC workshops. Our evaluation results show a 
significant improvement in the quality of summaries based on topic ... 

Keywords: summarization, topic themes 



7 Scalable collection summarization and selection 
R. Dolin, D. Agrawal, E. El Abbadi

8 Annotation-based multimedia summarization and translation
Katashi Nagao, Shigeki Ohira, Mitsuhiro Yoneoka

August 2002 Proceedings of the 19th international conference on Computational 
linguistics - Volume 1 

Publisher: Association for Computational Linguistics 

Full text available: pdf(366.61 KB) Additional Information: full citation, abstract, references

This paper presents techniques for multimedia annotation and their application to video 
summarization and translation. Our tool for annotation allows users to easily create 
annotation including voice transcripts, video scene descriptions, and visual/auditory object 
descriptions. The module for voice transcription is capable of multilingual spoken language 
identification and recognition. A video scene description consists of semi-automatically 
detected keyframes of each scene in a video clip and ... 

9 Links for a better web: Enhanced web document summarization using hyperlinks 
J.-Y. Delort, B. Bouchon-Meunier, M. Rifqi 

August 2003 Proceedings of the fourteenth ACM conference on Hypertext and
hypermedia
Publisher: ACM Press

Full text available: pdf(167.88 KB) Additional Information: full citation, abstract, references, citings, index terms

This paper addresses the issue of Web document summarization. As textual content of 
Web documents is often scarce or irrelevant and existing summarization techniques are 
based on it, many Web pages and websites cannot be suitably summarized. We consider 
the context of a Web document by the textual content of all the documents linking to it. 
To summarize a target Web document, a context-based summarizer has to perform a 
preprocessing task, during which it will be decided which pieces of informati ... 



Keywords: context, hyperlinks, summarization, web document 



10 Query-relevant summarization using FAQs
Adam Berger, Vibhu O. Mittal

October 2000 Proceedings of the 38th Annual Meeting on Association for
Computational Linguistics ACL '00
Publisher: Association for Computational Linguistics

Full text available: pdf(190.25 KB) Additional Information: full citation, abstract, references

This paper introduces a statistical model for query-relevant summarization: succinctly 
characterizing the relevance of a document to a query. Learning parameter values for the 
proposed model requires a large collection of summarized documents, which we do not 
have, but as a proxy, we use a collection of FAQ (frequently-asked question) documents. 
Taking a learning approach enables a principled, quantitative evaluation of the proposed 
system, and the results of some initial experiments, on ...

July 2001 Proceedings of the 40th Annual Meeting on Association for Computational
Linguistics ACL '02 

Publisher: Association for Computational Linguistics 

Full text available: pdf(124.62 KB) Additional Information: full citation, abstract, references

We present a document compression system that uses a hierarchical noisy-channel model 
of text production. Our compression system first automatically derives the syntactic 
structure of each sentence and the overall discourse structure of the text given as input. 
The system then uses a statistical hierarchical model of text production in order to drop 
non-important syntactic and discourse constituents so as to generate coherent, 
grammatical document compressions of arbitrary length. The system out ... 



12 A trainable document summarizer 



Julian Kupiec, Jan Pedersen, Francine Chen 

July 1995 Proceedings of the 18th annual international ACM SIGIR conference on 
Research and development in information retrieval 

Publisher: ACM Press

13 Multidocument summarization via information extraction

Michael White, Tanya Korelsky, Claire Cardie, Vincent Ng, David Pierce, Kiri Wagstaff 
March 2001 Proceedings of the first international conference on Human language 

technology research HLT '01 
Publisher: Association for Computational Linguistics 

Full text available: pdf(72.44 KB) Additional Information: full citation, abstract, references

We present and evaluate the initial version of RIPTIDES, a system that combines 
information extraction, extraction-based summarization, and natural language generation 
to support user-directed multidocument summarization. 

14 Document analysis 1: Semantic thumbnails: a novel method for summarizing
document collections

Arijit Sengupta, Mehmet Dalkilic, James Costello

October 2004 Proceedings of the 22nd annual international conference on Design of 
communication: The engineering of quality documentation 

Publisher: ACM Press 

Full text available: pdf(197.95 KB) Additional Information: full citation, abstract, references, index terms

The concept of thumbnails is common in image representation. A thumbnail is a highly 
compressed version of an image that provides a small, yet complete visual representation 
to the human eye. We propose the adaptation of the concept of thumbnails to the domain 
of documents, whereby a thumbnail of any document can be generated from its semantic 
content, providing an adequate amount of information about the documents. However, 
unlike image thumbnails, document thumbnails are mainly for the cons ... 

Keywords: document semantics, document summarization, semantic web, thumbnails

A more and more generalized problem in effective information access is the presence in
the same corpus of multiple documents that contain similar information. Generally, users 
may be interested in locating, for a topic addressed by a group of similar documents, one 
or several particular aspects. This kind of task, called instance or aspectual retrieval, has 
been explored in several TREC Interactive Tracks. In this article, we propose in addition to 
the classification capacity of clustering techn ... 

Keywords: Multidocument summarization, topic segmentation 



16 Automatic text representation, classification and labeling in European law 
Erich Schweighofer, Andreas Rauber, Michael Dittenbach 

May 2001 Proceedings of the 8th international conference on Artificial intelligence
and law
Publisher: ACM Press

Full text available: pdf(255.20 KB) Additional Information: full citation, abstract, references, index terms

The huge text archives and retrieval systems of legal information have not achieved yet 
the representation in the well-known subject-oriented structure of legal commentaries. 
Content-based classification and text analysis remains a high priority research topic. In 
the joint KONTERM, SOM and LabelSOM projects, learning techniques of neural networks 
are used to achieve similar high compression rates of classification and analysis like in 
manual legal indexing. The produced maps of legal text co ... 

17 Summarization-based query expansion in information retrieval 
Tomek Strzalkowski, Jin Wang, Bowden Wise 

August 1998 Proceedings of the 17th international conference on Computational 
linguistics - Volume 2 , Proceedings of the 36th annual meeting on 
Association for Computational Linguistics - Volume 2 

Publisher: Association for Computational Linguistics , Association for Computational Linguistics 

Full text available: pdf(726.07 KB)
Publisher Site

We discuss a semi-interactive approach to information retrieval which consists of two 
tasks performed in a sequence. First, the system assists the searcher in building a 
comprehensive statement of information need, using automatically generated topical 
summaries of sample documents. Second, the detailed statement of information need is 
automatically processed by a series of natural language processing routines in order to 
derive an optimal search query for a statistical information retrieval sys ... 

18 Trainable, scalable summarization using robust NLP and machine learning 
Chinatsu Aone, Mary Ellen Okurowski, James Gorlinsky 

August 1998 Proceedings of the 17th international conference on Computational 
linguistics - Volume 1 , Proceedings of the 36th annual meeting on 
Association for Computational Linguistics - Volume 1 

Publisher: Association for Computational Linguistics , Association for Computational Linguistics 

Full text available: pdf(478.76 KB) Additional Information: full citation, abstract, references
Publisher Site

We describe a trainable and scalable summarization system which utilizes features derived 
from information retrieval, information extraction, and NLP techniques and on-line 
resources. The system combines these features using a trainable feature combiner 
learned from summary examples through a machine learning algorithm. We demonstrate 
system scalability by reporting results on the best combination of summarization features 
for different document sources. We also present preliminary results from ...

19 Generic text summarization using relevance measure and latent semantic analysis

Yihong Gong, Xin Liu

September 2001 Proceedings of the 24th annual international ACM SIGIR conference
on Research and development in information retrieval 

Publisher: ACM Press

In this paper, we propose two generic text summarization methods that create text
summaries by ranking and extracting sentences from the original documents. The first 
method uses standard IR methods to rank sentence relevances, while the second method 
uses the latent semantic analysis technique to identify semantically important sentences, 
for summary creations. Both methods strive to select sentences that are highly ranked 
and different from each other. This is an attempt to create a summa ... 

Keywords: generic text summarization, relevance measure, semantic analysis 



20 Scatter/Gather: a cluster-based approach to browsing large document collections

Douglass R. Cutting, David R. Karger, Jan O. Pedersen, John W. Tukey
June 1992 Proceedings of the 15th annual international ACM SIGIR conference on
Research and development in information retrieval 

Publisher: ACM Press

Document clustering has not been well received as an information retrieval tool.
Objections to its use fall into two main categories: first, that clustering is too slow for 
large corpora (with running time often quadratic in the number of documents); and 
second, that clustering does not appreciably improve retrieval. We argue that these 
problems arise only when clustering is used in an attempt to improve conventional search 
techniques. However, looking at clustering as an informa ...

1 Techniques for automatically correcting words in text
Karen Kukich

December 1992 ACM Computing Surveys (CSUR), volume 24 issue 4
Publisher: ACM Press

Full text available: pdf(6.23 MB) Additional Information: full citation, abstract, references, citings, index terms, review

Research aimed at correcting words in text has focused on three progressively more
difficult problems: (1) nonword error detection; (2) isolated-word error correction; and (3)
context-dependent word correction. In response to the first problem, efficient
pattern-matching and n-gram analysis techniques have been developed for detecting strings that
do not appear in a given word list. In response to the second problem, a variety of 
general and application-specific spelling cor ... 

Keywords: n-gram analysis, Optical Character Recognition (OCR), context-dependent 
spelling correction, grammar checking, natural-language-processing models, neural net 
classifiers, spell checking, spelling error detection, spelling error patterns, statistical- 
language models, word recognition and correction 



2 Fast detection of communication patterns in distributed executions 
Thomas Kunz, Michiel F. H. Seuren 

November 1997 Proceedings of the 1997 conference of the Centre for Advanced
Studies on Collaborative research
Publisher: IBM Press

Full text available: pdf(74.21 MB) Additional Information: full citation, abstract, references, index terms

Understanding distributed applications is a tedious and difficult task. Visualizations based 
on process-time diagrams are often used to obtain a better understanding of the 
execution of the application. The visualization tool we use is Poet, an event tracer 
developed at the University of Waterloo. However, these diagrams are often very complex 
and do not provide the user with the desired overview of the application. In our 
experience, such tools display repeated occurrences of non-trivial commun ...

Our research aims at interactive document viewers that can select and highlight relevant
text passages on demand. Another related objective is the generation of topic-specific 
summaries of texts as opposed to general purpose summaries. This paper introduces our 
notions of discourse structure tree and level-of-detail tree. Both structures are used to 
represent relevant aspects of a text segment for the above mentioned purposes. 
Furthermore, we introduce a Knowledge Acquisition Frame ... 

Keywords: knowledge acquisition, natural language processing

April 1985 ACM SIGART Bulletin, issue 92

Publisher: ACM Press

Full text available: pdf(8.19 MB) Additional Information: full citation, abstract

The papers in this special issue were compiled from responses to the announcement in 
the July 1984 issue of the SIGART newsletter and notices posted over the ARPAnet. The 
interest being shown in this area is reflected in the sixty papers received from over six 
countries. About half the papers were received over the computer network. 



5 Annotation-based multimedia summarization and translation
Katashi Nagao, Shigeki Ohira, Mitsuhiro Yoneoka 

August 2002 Proceedings of the 19th international conference on Computational 
linguistics - Volume 1 

Publisher: Association for Computational Linguistics 

Full text available: pdf(366.61 KB) Additional Information: full citation, abstract, references

This paper presents techniques for multimedia annotation and their application to video 
summarization and translation. Our tool for annotation allows users to easily create 
annotation including voice transcripts, video scene descriptions, and visual/auditory object 
descriptions. The module for voice transcription is capable of multilingual spoken language 
identification and recognition. A video scene description consists of semi-automatically 
detected keyframes of each scene in a video clip and ... 

6 Facial modeling and animation
Jorg Haber, Demetri Terzopoulos

August 2004 Proceedings of the conference on SIGGRAPH 2004 course notes GRAPH '04

Publisher: ACM Press

Full text available: pdf Additional Information: full citation, abstract

In this course we present an overview of the concepts and current techniques in facial 
modeling and animation. We introduce this research area by its history and applications. 
As a necessary prerequisite for facial modeling, data acquisition is discussed in detail. We 
describe basic concepts of facial animation and present different approaches including 
parametric models, performance-, physics-, and learning-based methods. State-of-the-art 
techniques such as muscle-based facial animation, mass-s ...

August 1999 Proceedings of the 22nd annual international ACM SIGIR conference on

Keywords: comparing interfaces for information access, field/empirical studies of the
information seeking process, speech indexing and retrieval, user studies 



8 SpeechSkimmer: a system for interactively skimming recorded speech
Barry Arons

March 1997 ACM Transactions on Computer-Human Interaction (TOCHI), volume 4 issue
Publisher: ACM Press

Full text available: pdf(1.03 MB) Additional Information: full citation, abstract, references, citings, index terms

Listening to a speech recording is much more difficult than visually scanning a document
because of the transient and temporal nature of audio. Audio recordings capture the 
richness of speech, yet it is difficult to directly browse the stored information. This article 
describes techniques for structuring, filtering, and presenting recorded speech, allowing a 
user to navigate and interactively find information in the audio domain. This article 
describes the SpeechSkimmer system for interacti ... 

Keywords: audio browsing, interactive listening, nonspeech audio, speech as data, 
speech skimming, speech user interfaces, time compression 



9 Special issue on knowledge representation 
Ronald J. Brachman, Brian C. Smith

February 1980 ACM SIGART Bulletin, issue 70

Publisher: ACM Press

Full text available: pdf(13.13 MB) Additional Information: full citation, abstract

In the fall of 1978 we decided to produce a special issue of the SIGART Newsletter 
devoted to a survey of current knowledge representation research. We felt that there 
were two useful functions such an issue could serve. First, we hoped to elicit a clear
picture of how people working in this subdiscipline understand knowledge representation 
research, to illuminate the issues on which current research is focused, and to catalogue 
what approaches and techniques are currently being developed. Secon ... 

10 Building searchable collections of enterprise speech data
James Cooper, Mahesh Viswanathan, Donna Byron, Margaret Chan

January 2001 Proceedings of the 1st ACM/IEEE-CS joint conference on Digital
libraries
Publisher: ACM Press

Full text available: pdf(356.53 KB) Additional Information: full citation, abstract, references, index terms

We have applied speech recognition and text-mining technologies to a set of recorded 
outbound marketing calls and analyzed the results. Since speaker-independent speech 
recognition technology results in a significantly lower recognition rate than that found 
when the recognizer is trained for a particular speaker, we applied a number of post- 
processing algorithms to the output of the recognizer to render it suitable for the Textract 
text mining system. We indexed the call transcri ... 

Keywords: document display, search, speech analysis, speech retrieval, text mining 
11 Information fusion in the context of multi-document summarization
Regina Barzilay, Kathleen R. McKeown, Michael Elhadad 

June 1999 Proceedings of the 37th annual meeting of the Association for 
Computational Linguistics on Computational Linguistics 

Publisher: Association for Computational Linguistics 

Full text available: pdf(907.95 KB) Additional Information: full citation, abstract, references, citings

We present a method to automatically generate a concise summary by identifying and 
synthesizing similar elements across related text from a set of multiple documents. Our 
approach is unique in its usage of language generation to reformulate the wording of the 
summary. 
Jae-Hoon Kim, Joon-Hong Kim, Dosam Hwang

November 2000 Proceedings of the fifth international workshop on Information
retrieval with Asian languages 

Publisher: ACM Press 

Full text available: pdf(605.38 KB) Additional Information: full citation, abstract, references, citings

In this paper, each document is represented by a weighted graph called a text 
relationship map. In the graph, each node represents a vector of nouns in a sentence, an 
undirected link connects two nodes if two sentences are semantically related, and a 
weight on the link is a value of the similarity between a pair of sentences. The vector 
similarity can be computed as the inner product between corresponding vector elements. 
The similarity is based on the word overlap between the corresponding s ... 

Keywords: Korean noun extraction, Korean text summarization, aggregate similarity 
Web pages often contain clutter (such as pop-up ads, unnecessary images and
extraneous links) around the body of an article that distracts a user from actual content. 
Extraction of "useful and relevant" content from web pages has many applications, 
including cell phone and PDA browsing, speech rendering for the visually impaired, and 
text summarization. Most approaches to removing clutter or making content more 
readable involve changing font size or removing HTML and data components such as 
imag ... 
16 Regular papers: DiaSumm: flexible summarization of spontaneous dialogues in
Klaus Zechner, Alex Waibel

July 2000 Proceedings of the 18th conference on Computational linguistics - Volume 
2 

Publisher: Association for Computational Linguistics 

Full text available: pdf(599.63 KB) Additional Information: full citation, abstract, references

In this paper, we present a summarization system for spontaneous dialogues which 
consists of a novel multi-stage architecture. It is specifically aimed at addressing issues 
related to the nature of the texts being spoken vs. written and being dialogical vs. 
monological. The system is embedded in a graphical user interface and was developed 
and tested on transcripts of recorded telephone conversations in English and Spanish 
(CALLHOME). 

17 Conference abstracts
January 1977 Proceedings of the 5th annual ACM computer science conference

Publisher: ACM Press 

Full text available: pdf(3.14 MB) Additional Information: full citation, abstract, index terms

One problem in computer program testing arises when errors are found and corrected 
after a portion of the tests have run properly. How can it be shown that a fix to one area 
of the code does not adversely affect the execution of another area? What is needed is a 
quantitative method for assuring that new program modifications do not introduce new 
errors into the code. This model considers the retest philosophy that every program 
instruction that could possibly be reached and tested from the ... 
July 1991 ACM SIGCHI Bulletin, volume 23 issue 3
Computer-supported cooperative work (CSCW) is a new multi-disciplinary field with roots
in many disciplines. Due to the area's youth and diversity, few specialized books or 
journals are available, and articles are scattered amongst diverse journals, proceedings 
and technical reports. Building a CSCW reference library is particularly demanding, for it is 
difficult for the new researcher to discover relevant documents. To aid this task, this 
article compiles, lists and annotates some of the curren ... 
This paper consists of three interrelated parts. In the first part forms are introduced as an
abstraction and generalization of business paper forms. A set of facilities for the 
manipulation of forms and their contents is outlined. Forms can be created, stored, found, 
viewed in different media, mailed, and located by office workers. Data on forms can also 
be processed in a completely integrated way. The facilities are discussed both abstractly
and in relation to a prototype ... 

Keywords: database management, office modeling, office procedures 
This paper presents new approaches to headline generation for English newspaper texts,
with an eye toward the production of document surrogates for document selection in 
cross-language information retrieval. This task is difficult because the user must make 
decisions about relevance based on (often poor) translations of retrieved documents. To 
facilitate the decision-making process we need translations that can be assessed rapidly 
and accurately; our approach is to provide an English head ... 